52 research outputs found

Target selection and annotation for the structural genomics of the amidohydrolase and enolase superfamilies

Author: A Andreeva
A Sakai
A Weeks
AE Todd
Andrej Sali
C Nowlan
CH Wu
CM Seibert
D Vitkup
DA Benson
DL Wheeler
EF Pettersen
F Melo
Frank M. Raushel
H Berman
HJ Imker
J Akana
J Gough
J Lee
J. Michael Sauder
JA Gerlt
JB Bonanno
JB Thoden
JC Hermann
JC Norvell
JC Venter
JE Vick
Jeffrey B. Bonanno
Jennifer J. Seffernick
JF Rakus
JJ Irwin
John A. Gerlt
L Holm
L Song
L Williams
Libusha Kelly
Margaret E. Glasner
Mark R. Chance
Matthew P. Jacobson
ME Glasner
N Eswar
N Nagano
Narayanan Eswar
P Shannon
Patricia C. Babbitt
PC Babbitt
R Marti-Arbona
R Sanchez
R Tyagi
Ranyee Chiang
RS Hall
RZ Liao
SC Almo
SC Pegg
SD Brown
SF Altschul
Shoshana D. Brown
SL Schafer
Stephen K. Burley
Steven C. Almo
Subramanyam Swaminathan
TN Porter
TT Nguyen
U Pieper
Ursula Pieper
WS Yew
Xiaojing Zheng
Y Li
Publication venue: Springer Netherlands
Publication date: 01/01/2009

To study the substrate specificity of enzymes, we use the amidohydrolase and enolase superfamilies as model systems; members of these superfamilies share a common TIM barrel fold and catalyze a wide range of chemical reactions. Here, we describe a collaboration between the Enzyme Specificity Consortium (ENSPEC) and the New York SGX Research Center for Structural Genomics (NYSGXRC) that aims to maximize the structural coverage of the amidohydrolase and enolase superfamilies. Using sequence- and structure-based protein comparisons, we first selected 535 target proteins from a variety of genomes for high-throughput structure determination by X-ray crystallography; 63 of these targets were not previously annotated as superfamily members. To date, 20 unique amidohydrolase and 41 unique enolase structures have been determined, increasing the fraction of sequences in the two superfamilies that can be modeled based on at least 30% sequence identity from 45% to 73%. We present case studies of proteins related to uronate isomerase (an amidohydrolase superfamily member) and mandelate racemase (an enolase superfamily member) to illustrate how this structure-focused approach can be used to generate hypotheses about sequence–structure–function relationships.
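The coverage figure quoted in this abstract (the fraction of sequences that can be modeled at 30% sequence identity or better) can be illustrated with a minimal sketch. The function name `modeling_coverage`, the input format, and the toy values are assumptions made for illustration; they are not taken from the paper.

```python
from typing import Iterable


def modeling_coverage(best_identities: Iterable[float], threshold: float = 30.0) -> float:
    """Fraction of sequences whose closest solved structure meets the identity threshold.

    best_identities holds, for each superfamily sequence, the percent sequence
    identity to its closest structural template (hypothetical input format).
    """
    values = list(best_identities)
    if not values:
        raise ValueError("no sequences supplied")
    modelable = sum(1 for identity in values if identity >= threshold)
    return modelable / len(values)


# Toy data, invented for illustration only (not from the study).
before = [12.0, 45.0, 28.0, 31.0]
after = [35.0, 45.0, 28.0, 31.0]  # a new structure raises the first sequence's best hit
print(modeling_coverage(before))  # 0.5
print(modeling_coverage(after))   # 0.75
```

Each newly solved structure can only raise a sequence's best identity, so coverage computed this way never decreases as structures are added.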

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

FLORA: a novel method to predict protein function from structure in diverse superfamilies

Predicting protein function from structure remains an active area of interest, particularly for the structural genomics initiatives where a substantial number of structures are initially solved with little or no functional characterisation. Although global structure comparison methods can be used to transfer functional annotations, the relationship between fold and function is complex, particularly in functionally diverse superfamilies that have evolved through different secondary structure embellishments to a common structural core. The majority of prediction algorithms employ local templates built on known or predicted functional residues. Here, we present a novel method (FLORA) that automatically generates structural motifs associated with different functional sub-families (FSGs) within functionally diverse domain superfamilies. Templates are created purely on the basis of their specificity for a given FSG, and the method makes no prior prediction of functional sites, nor assumes specific physico-chemical properties of residues. FLORA is able to accurately discriminate between homologous domains with different functions and substantially outperforms (a 2–3 fold increase in coverage at low error rates) popular structure comparison methods and a leading function prediction method. We benchmark FLORA on a large data set of enzyme superfamilies from all three major protein classes (α, β, αβ) and demonstrate the functional relevance of the motifs it identifies. We also provide novel predictions of enzymatic activity for a large number of structures solved by the Protein Structure Initiative. Overall, we show that FLORA is able to effectively detect functionally similar protein domain structures purely by using patterns of structural conservation of all residues.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

UCL Discovery

PubMed Central

Evolutionarily Conserved Substrate Substructures for Automated Annotation of Enzyme Superfamilies

Author: A Aharoni
AE Todd
AG Murzin
Andrej Sali
AS Mildvan
C Kalyanaraman
C Steinbeck
CM Seibert
CS Riesenfeld
CT Porter
D Weininger
DJ Weininger
DM Schmidt
GL Holliday
HM Holden
I Friedberg
I Nobeli
I Schomburg
I Shah
J Barthelmes
JA Gerlt
JC Hermann
JJ Diaz-Mejia
K Tipton
KA Frazer
KN Allen
L Holm
L Song
M Ashburner
M Bashton
M Kotera
MA Marti-Renom
ME Glasner
MJ Bessman
MJ Keiser
N Nagano
NH Horowitz
NM O'Boyle
Patricia C. Babbitt
PC Babbitt
R Alves
RA Nagatani
Ranyee A. Chiang
Robert B. Russell
S Light
S Schmidt
SC Pegg
SC Rison
SD Copley
TL O'Loughlin
WR Pearson
Publication venue: Public Library of Science
Publication date: 01/08/2008

The evolution of enzymes affects how well a species can adapt to new environmental conditions. During enzyme evolution, certain aspects of molecular function are conserved while other aspects can vary. Aspects of function that are more difficult to change or that need to be reused in multiple contexts are often conserved, while those that vary may indicate functions that are more easily changed or that are no longer required. In analogy to the study of conservation patterns in enzyme sequences and structures, we have examined the patterns of conservation and variation in enzyme function by analyzing graph isomorphisms among enzyme substrates of a large number of enzyme superfamilies. This systematic analysis of substrate substructures establishes the conservation patterns that typify individual superfamilies. Specifically, we determined the chemical substructures that are conserved among all known substrates of a superfamily and the substructures that are reacting in these substrates and then examined the relationship between the two. Across the 42 superfamilies that were analyzed, substantial variation was found in how much of the conserved substructure is reacting, suggesting that superfamilies may not be easily grouped into discrete and separable categories. Instead, our results suggest that many superfamilies may need to be treated individually for analyses of evolution, function prediction, and guiding enzyme engineering strategies. Annotating superfamilies with these conserved and reacting substructure patterns provides information that is orthogonal to information provided by studies of conservation in superfamily sequences and structures, thereby improving the precision with which we can predict the functions of enzymes of unknown function and direct studies in enzyme engineering. 
Because the method is automated, it is suitable for large-scale characterization and comparison of fundamental functional capabilities of both characterized and uncharacterized enzyme superfamilies.
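The relationship between conserved and reacting substructures described in this abstract can be sketched in simplified form. The paper works with graph isomorphisms among substrates; the sketch below replaces that step with plain set intersection, under the assumption that substrate atoms have already been mapped to a shared numbering. The function names and toy data are hypothetical.

```python
from typing import List, Set


def conserved_substructure(substrates: List[Set[int]]) -> Set[int]:
    """Atoms present in every substrate, assuming a shared atom numbering.

    This stands in for a real common-substructure search and is a
    simplification, not the method used in the paper.
    """
    if not substrates:
        return set()
    return set.intersection(*substrates)


def reacting_fraction(conserved_atoms: Set[int], reacting_atoms: Set[int]) -> float:
    """Share of the conserved substructure that takes part in the reaction."""
    if not conserved_atoms:
        return 0.0
    return len(conserved_atoms & reacting_atoms) / len(conserved_atoms)


# Toy substrates, invented for illustration: atoms 1-3 are shared by all.
substrates = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 3, 6}]
conserved = conserved_substructure(substrates)
print(sorted(conserved))                      # [1, 2, 3]
print(reacting_fraction(conserved, {3, 4}))   # one of three conserved atoms reacts
```

A value near 1 would indicate that the conserved substructure is mostly the reacting part, and a value near 0 that conservation lies mostly outside it.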

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Recombining Low Homology, Functionally Rich Regions of Bacterial Subtilisins by Combinatorial Fragment Exchange

Author: A Andreeva
AM Simm
Annalisa Pastore
AV Teplyakov
AY Lee
C Betzel
C Chothia
C Neylon
CA Smith
CS Wright
D. Dafydd Jones
DD Jones
DJ Neidhart
H Gron
H Zhao
HS Toogood
IS Povolotskaya
J Eder
J Minshull
J Vevodova
JA Wells
JE Ness
JJ Perona
KM Stott
MA DePristo
ME Glasner
MM Meyer
N Guex
N Tindbaek
N Tokuriki
NC Rockwell
P Carter
PN Bryan
R Bott
R Gupta
RJ Siezen
S Bershtein
S Ewert
SC Jain
W Stemmer
Publication venue: Public Library of Science
Publication date: 01/01/2011

Combinatorial fragment exchange was utilised to recombine key structural and functional low homology regions of bacilli subtilisins to generate new active hybrid proteases with altered substrate profiles. Up to six different regions composed mostly of loop residues from the commercially important subtilisin Savinase were exchanged with the structurally equivalent regions of six other subtilisins. The six additional subtilisins derived from diverse origins and included thermophilic and intracellular subtilisins as well as other academically and commercially relevant subtilisins. Savinase was largely tolerant to fragment exchange; rational replacement of all six regions with 5 of 6 donating subtilisin sequences preserved activity, albeit reduced compared to Savinase. A combinatorial approach was used to generate hybrid Savinase variants in which the sequences derived from all seven subtilisins at each region were recombined to generate new region combinations. Variants with different substrate profiles and with greater apparent activity than Savinase and the rational fragment exchange variants were generated; the substrate profile of each variant depended on the sequence combination at each region.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Online Research @ Cardiff

Directory of Open Access Journals

PubMed Central

Exploring the Evolution of Novel Enzyme Functions within Structurally Defined Protein Superfamilies

Author: A Andreeva
AE Todd
AL Cuff
Alison L. Cuff
AU Tamuri
BE Engelhardt
BH Dessailly
C Chothia
CA Orengo
Christine A. Orengo
DA Benson
DE Almonacid
DM Schmidt
DS Tawfik
G Caetano-Anolles
GA Reeves
Gemma L. Holliday
GJ Bartlett
GJ Binford
GL Holliday
HS Park
I Nobeli
Ian Sillitoe
J Ruan
J Shi
Janet M. Thornton
JP Overington
K Katoh
LH Greene
M Bashton
M Groll
M Xu
ME Glasner
MT Murakami
N Furnham
N Gallastegui
Nicholas Furnham
NJ Mulder
O Khersonsky
PF Gherardini
PJ O'Brien
Roman A. Laskowski
SC Pegg
SD Brown
SF Altschul
W Heinemeyer
WS Valdar
Yanay Ofran
Publication venue: Public Library of Science
Publication date: 01/01/2011

In order to understand the evolution of enzyme reactions and to gain an overview of biological catalysis, we have combined sequence and structural data to generate phylogenetic trees in an analysis of 276 structurally defined enzyme superfamilies, and used these to study how enzyme functions have evolved. We describe in detail the analysis of two superfamilies to illustrate different paradigms of enzyme evolution. Gathering together data from all the superfamilies supports and develops the observation that they have all evolved to act on a diverse set of substrates, whilst the evolution of new chemistry is much less common. Despite that, by bringing together so much data, we can provide a comprehensive overview of the most common and rare types of changes in function. Our analysis demonstrates, on a larger scale than previously studied, that modifications in overall chemistry still occur, with all possible changes at the primary level of the Enzyme Commission (E.C.) classification observed to a greater or lesser extent. The phylogenetic trees map out the evolutionary route taken within a superfamily, as well as all the possible changes within a superfamily. This has been used to generate a matrix of observed exchanges from one enzyme function to another, revealing the scale and nature of enzyme evolution and that some types of exchanges between and within E.C. classes are more prevalent than others. Surprisingly, a large proportion (71%) of all known enzyme functions are performed by this relatively small set of 276 superfamilies. This reinforces the hypothesis that relatively few ancient enzymatic domain superfamilies were progenitors for most of the chemistry required for life.
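The matrix of observed exchanges mentioned in this abstract can be pictured with a small sketch that tallies function changes by primary E.C. class. It assumes the changes have already been read off a phylogenetic tree as (ancestral, derived) pairs of E.C. numbers; the function names and the example pairs are hypothetical and do not represent inferred ancestry from the paper.

```python
from collections import Counter
from typing import Iterable, Tuple


def primary_class(ec_number: str) -> int:
    """Return the first (primary) level of an E.C. number such as '5.1.2.2'."""
    return int(ec_number.split(".")[0])


def exchange_matrix(changes: Iterable[Tuple[str, str]]) -> Counter:
    """Count function changes between primary E.C. classes.

    Each item is an (ancestral, derived) pair of E.C. numbers (assumed input).
    Pairs with identical E.C. numbers are not changes and are skipped.
    """
    counts: Counter = Counter()
    for ancestral, derived in changes:
        if ancestral != derived:
            counts[(primary_class(ancestral), primary_class(derived))] += 1
    return counts


# Illustrative pairs only; they are not results from the study.
toy_changes = [
    ("5.1.2.2", "4.2.1.40"),  # change between primary classes 5 and 4
    ("5.1.2.2", "5.1.2.2"),   # no change, ignored
    ("4.2.1.11", "4.2.1.40"), # change within primary class 4
]
matrix = exchange_matrix(toy_changes)
print(matrix[(5, 4)], matrix[(4, 4)])  # 1 1
```

Off-diagonal keys such as `(5, 4)` correspond to changes in overall chemistry, while diagonal keys such as `(4, 4)` correspond to changes within a primary class.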

Public Library of Science (PLOS)

CiteSeerX

Crossref

LSHTM Research Online

Directory of Open Access Journals

PubMed Central

UCL Discovery

FigShare

Characterization and Comparison of the Tissue-Related Modules in Human and Mouse

Author: AI Su
Art F. Y. Poon
B Zhang
Bing Su
BY Liao
D Deutscher
D Segre
D Smedley
DT Odom
E Hubbell
E Ravasz
E Segal
EA Glazov
EI Boyle
G Bejerano
G Chartrand
GM Rubin
H Ge
H Kitano
H Ramsay
HB Fraser
J Ihmels
J Yang
JD Thompson
M Ashburner
MA Pujana
MB Eisen
MD Wilson
ME Glasner
OR Bininda-Emonds
P Khaitovich
P Pamilo
P Tamayo
P Tsaparas
PM Kim
R Nielsen
Ruolin Yang
S Bergmann
S Tavazoie
SA Rifkin
TC Freeman
W Enard
WS Cleveland
XJ Yu
Y Guan
YQ Wang
Z Wang
Z Wu
Z Yang
Publication venue: Public Library of Science
Publication date: 22/07/2010

BACKGROUND: Due to the advances of high throughput technology and data-collection approaches, we are now in an unprecedented position to understand the evolution of organisms. Great efforts have characterized many individual genes responsible for the interspecies divergence, yet little is known about the genome-wide divergence at a higher level. Modules, serving as the building blocks and operational units of biological systems, provide more information than individual genes. Hence, the comparative analysis between species at the module level would shed more light on the mechanisms underlying the evolution of organisms than the traditional comparative genomics approaches. RESULTS: We systematically identified the tissue-related modules using the iterative signature algorithm (ISA), and we detected 52 and 65 modules in the human and mouse genomes, respectively. The gene expression patterns indicate that all of these predicted modules have a high possibility of serving as real biological modules. In addition, we defined a novel quantity, "total constraint intensity," a proxy of multiple constraints (of co-regulated genes and tissues where the co-regulation occurs) on the evolution of genes in module context. We demonstrate that the evolutionary rate of a gene is negatively correlated with its total constraint intensity. Furthermore, there are modules coding the same essential biological processes, while their gene contents have diverged extensively between human and mouse. CONCLUSIONS: Our results suggest that unlike the composition of modules, which differs greatly between human and mouse, the functional organization of the corresponding modules may evolve in a more conservative manner. Most importantly, our findings imply that similar biological processes can be carried out by different sets of genes in human and mouse; therefore, the functional data of individual genes from mouse may not apply to human on certain occasions.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

The FGGY carbohydrate kinase family: insights into the evolution of functional specificities

Author: A Osterman
A Vendeville
Adam Godzik
AE Todd
AM Schnoes
Andrei Osterman
B Reva
BE Engelhardt
BG Magor
CA Bonner
CA Orengo
Christos A. Ouzounis
CM Seibert
D Grueninger
D Wu
DA Lee
DA Rodionov
E Di Luccio
G Casari
GE Crooks
HM Berman
I Letunic
Irina Rodionova
JA Capra
JA Gerlt
JH Hurley
JI Yeh
K Sjolander
K Ye
KB Xavier
LA David
M Ormo
M Pachkov
ME Glasner
MN Price
MV Omelchenko
N Krishnamurthy
Olga Zagnitko
OV Kalinina
P Shannon
R Overbeek
RC Edgar
RD Finn
RK Aziz
S Cheek
SS Hannenhalli
TA Tatusova
TT Nguyen
W-D Fessner
Y Zhang
Ying Zhang
Publication venue: Public Library of Science (PLoS)
Publication date: 01/12/2011

© The Author(s), 2011. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in PLoS Computational Biology 7 (2011): e1002318, doi:10.1371/journal.pcbi.1002318. Function diversification in large protein families is a major mechanism driving expansion of cellular networks, providing organisms with new metabolic capabilities and thus adding to their evolutionary success. However, our understanding of the evolutionary mechanisms of functional diversity in such families is very limited, which, among many other reasons, is due to the lack of functionally well-characterized sets of proteins. Here, using the FGGY carbohydrate kinase family as an example, we built a confidently annotated reference set (CARS) of proteins by propagating experimentally verified functional assignments to a limited number of homologous proteins that are supported by their genomic and functional contexts. Then, we analyzed, on both the phylogenetic and the molecular levels, the evolution of different functional specificities in this family. The results show that the different functions (substrate specificities) encoded by FGGY kinases have emerged only once in the evolutionary history, following an apparently simple divergent evolutionary model. At the same time, on the molecular level, one isofunctional group (L-ribulokinase, AraB) evolved at least two independent solutions that employed distinct specificity-determining residues for the recognition of the same substrate (L-ribulose). Our analysis provides a detailed model of the evolution of the FGGY kinase family. It also shows that only combined molecular and phylogenetic approaches can help reconstruct a full picture of functional diversifications in such diverse families. This study was funded by NIH and DOE grants.

Public Library of Science (PLOS)

Crossref

Woods Hole Open Access Server

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Detection of anti-Toxoplasma gondii antibodies in intraocular fluids (vitreous humor and aqueous humor) of patients with ocular toxoplasmosis in the city of Belém, PA

Author: Abreu MT
Camargo ME
Canosa A
Cléa Nazaré Bichara
Ediclei Lima do Carmo
Edmundo Frota Almeida
Garweg JG
Glasner PD
Hay J
Katina JH
Marinete Marins Póvoa
Montoya JG
Nussenblatt RB
Ongkosuwito JV
Oréfice F.
Perkins ES
Ronday MJH
Silveira C
Tenter AM
Turunen HJ
Publication venue: FapUNIFESP (SciELO)

Crossref

A Measure of the Promiscuity of Proteins and Characteristics of Residues in the Vicinity of the Catalytic Site That Regulate Promiscuity

Promiscuity, the basis for the evolution of new functions through ‘tinkering’ of residues in the vicinity of the catalytic site, is yet to be quantitatively defined. We present a computational method, Promiscuity Indices Estimator (PROMISE), based on signatures derived from the spatial and electrostatic properties of the catalytic residues, to estimate the promiscuity (PromIndex) of proteins with known active site residues and 3D structure. PromIndex reflects the number of different active site signatures that have congruent matches in close proximity of its native catalytic site, the quality of the matches, and the difference in enzymatic activity. Promiscuity in proteins is observed to follow a lognormal distribution (μ = 0.28, σ = 1.1, reduced chi-square = 3.0E-5). The PROMISE-predicted promiscuous functions in any protein can serve as the starting point for directed evolution experiments. PROMISE ranks carboxypeptidase A and ribonuclease A amongst the more promiscuous proteins. We have also investigated the properties of the residues in the vicinity of the catalytic site that regulate its promiscuity. Linear regression establishes a weak correlation (R² ∼ 0.1) between certain properties of the residues (charge, polarity, etc.) in the neighborhood of the catalytic residues and PromIndex. A stronger relationship is that most proteins with high promiscuity have high percentages of charged and polar residues within a radius of 3 Å of the catalytic site, which is validated using one-tailed hypothesis tests (P-values ∼ 0.05). Since it is known that these characteristics are key factors in catalysis, their relationship with the promiscuity index cross-validates the methodology of PROMISE.
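The lognormal distribution reported in this abstract can be made concrete with a short sketch of its density and cumulative distribution. It assumes that μ = 0.28 and σ = 1.1 are the mean and standard deviation of the logarithm of PromIndex, which is the usual parameterisation but is not stated in the abstract; the function names are hypothetical.

```python
import math


def lognormal_pdf(x: float, mu: float = 0.28, sigma: float = 1.1) -> float:
    """Density of a lognormal variable; mu and sigma are on the log scale (assumed)."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def lognormal_cdf(x: float, mu: float = 0.28, sigma: float = 1.1) -> float:
    """Probability that the lognormal variable is at most x."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


# Under this parameterisation the median of the distribution is exp(mu).
median = math.exp(0.28)
print(lognormal_cdf(median))  # 0.5
```

With these parameters, half of the proteins would have a promiscuity value below exp(0.28), and the long right tail would hold the more promiscuous ones.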

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Annotation Error in Public Databases: Misannotation of Molecular Function in Enzyme Superfamilies

Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation levels for molecular function in four public protein sequence databases (UniProtKB/Swiss-Prot, GenBank NR, UniProtKB/TrEMBL, and KEGG) for a model set of 37 enzyme families for which extensive experimental information is available. The manually curated database Swiss-Prot shows the lowest annotation error levels (close to 0% for most families); the two other protein sequence databases (GenBank NR and TrEMBL) and the protein sequences in the KEGG pathways database exhibit similar and surprisingly high levels of misannotation that average 5%–63% across the six superfamilies studied. For 10 of the 37 families examined, the level of misannotation in one or more of these databases is >80%. Examination of the NR database over time shows that misannotation has increased from 1993 to 2005. The types of misannotation that were found fall into several categories, most associated with “overprediction” of molecular function. These results suggest that misannotation in enzyme superfamilies containing multiple families that catalyze different reactions is a larger problem than has been recognized. Strategies are suggested for addressing some of the systematic problems contributing to these high levels of misannotation.
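A misannotation level of the kind reported in this abstract can be expressed as the percentage of sequences annotated to a family that lack supporting evidence. The sketch below is one possible formulation, not the paper's procedure; the function name, the notion of a "supported" set, and the identifiers are assumptions made for illustration.

```python
from typing import Set


def misannotation_level(annotated: Set[str], supported: Set[str]) -> float:
    """Percent of sequences annotated to a family whose annotation is unsupported.

    annotated: sequence IDs a database assigns to the family (assumed input).
    supported: sequence IDs that meet the family's evidence criteria (assumed input).
    """
    if not annotated:
        raise ValueError("no annotated sequences supplied")
    unsupported = annotated - supported
    return 100.0 * len(unsupported) / len(annotated)


# Invented identifiers, for illustration only.
annotated = {"seq1", "seq2", "seq3", "seq4", "seq5"}
supported = {"seq1", "seq4"}
print(misannotation_level(annotated, supported))  # 60.0
```

Computing this value per family and per database would give figures comparable in form to the percentages quoted above.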

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central